Conversation
Without trimming the leading and trailing newlines, Vertica would fail to parse the
compiled SQL. For example, `models/edr/dbt_artifacts/dbt_columns`
compiles the following SQL, via `elementary.get_dbt_columns_empty_table_query`, `empty_table` and `empty_column`:
```sql
select * from (
select
cast('dummy_string' as varchar(4096)) as unique_id
,
cast('dummy_string' as varchar(4096)) as parent_unique_id
,
cast('dummy_string' as varchar(4096)) as name
,
cast('dummy_string' as varchar(4096)) as data_type
,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as tags
,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as meta
,
cast('dummy_string' as varchar(4096)) as database_name
,
cast('dummy_string' as varchar(4096)) as schema_name
,
cast('dummy_string' as varchar(4096)) as table_name
,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as description
,
cast('dummy_string' as varchar(4096)) as resource_type
,
cast('dummy_string' as varchar(4096)) as generated_at
,
cast('dummy_string' as varchar(4096)) as metadata_hash
) as empty_table
where 1 = 0
```
which would cause
```
SQL Error [4856] [42601]: [Vertica][VJDBC](4856) ERROR: Syntax error at or near ")" at character 1
```
By trimming the newlines, the SQL is much tighter:
```sql
select * from (
select
cast('dummy_string' as varchar(4096)) as unique_id,
cast('dummy_string' as varchar(4096)) as parent_unique_id,
cast('dummy_string' as varchar(4096)) as name,
cast('dummy_string' as varchar(4096)) as data_type,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as tags,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as meta,
cast('dummy_string' as varchar(4096)) as database_name,
cast('dummy_string' as varchar(4096)) as schema_name,
cast('dummy_string' as varchar(4096)) as table_name,
cast('this_is_just_a_long_dummy_string' as varchar(4096)) as description,
cast('dummy_string' as varchar(4096)) as resource_type,
cast('dummy_string' as varchar(4096)) as generated_at,
cast('dummy_string' as varchar(4096)) as metadata_hash
) as empty_table
where 1 = 0
```
and this runs in Vertica just fine.
This fixed 4 or 5 errors when running in my test project.
WalkthroughAdds Vertica support across CI and tests: workflow matrix and inputs, conditional dbt-vertica install and Vertica startup steps, a docker-compose file for Vertica, two Vertica-specific macros (timeadd and escape_special_chars), and minor Jinja whitespace tightening. Changes
Sequence Diagram(s)sequenceDiagram
participant User as User / CI Trigger
participant WF as GitHub Workflow
participant Installer as Adapter Installer
participant Compose as docker-compose-vertica
participant DBT as dbt Runner
Note over User,WF: workflow_dispatch / workflow_call with inputs.warehouse-type="vertica"
User->>WF: inputs (warehouse-type=vertica, dbt-vertica-version)
WF->>Installer: if warehouse-type == "vertica" -> Install dbt-vertica
Installer-->>WF: dbt-vertica installed
WF->>Compose: Start Vertica (docker-compose-vertica.yml, env SCHEMA_NAME, creds)
Compose-->>WF: Vertica healthy (healthcheck)
WF->>DBT: write profiles (replace <SCHEMA_NAME>) and run dbt commands
DBT-->>WF: test results
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
👋 @jc00ke |
|
I'm going to debug in our own Actions so no need to run the workflows here yet. I'll report back when I think it's actually ready to review! |
|
It looks like the contents of |
This should be a lot faster than pulling from docker.io
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.github/workflows/test-warehouse.yml (1)
136-142: Standardize if-condition syntax for consistency.Line 137 uses
if: ${{ inputs.warehouse-type == 'vertica' }}(with${{ }}wrapper), while earlier conditional steps (lines 103, 108, 113, 118) useif: inputs.warehouse-type == ...(without wrapper). Both syntaxes are valid in GitHub Actions, but this inconsistency reduces readability. Consider aligning line 137 and line 142 with the existing style used throughout the file.Example fix to match existing style:
- - name: Install dbt-vertica - if: ${{ inputs.warehouse-type == 'vertica' }} + - name: Install dbt-vertica + if: inputs.warehouse-type == 'vertica'Alternatively, if the
${{ }}wrapper is preferred, update all earlier if-conditions to be consistent.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
.github/workflows/test-warehouse.yml(6 hunks)
🔇 Additional comments (3)
.github/workflows/test-warehouse.yml (3)
20-20: Input definitions look good.Vertica is properly added to warehouse-type options, and dbt-vertica-version is consistently defined across workflow_dispatch and workflow_call with sensible defaults.
Also applies to: 34-58
163-187: Verify that Vertica environment variables are correctly consumed.Lines 174–182 set Vertica-specific environment variables and export them to
GITHUB_ENVfor persistence across steps. However, these variables must be correctly referenced by:
- The
docker-compose-vertica.ymlfile (if they're passed to the container)- The dbt profiles template (if profiles.yml uses them instead of direct sed substitution)
The flow is logically sound—SCHEMA_NAME is computed once and reused—but I need to confirm that the downstream consumers actually expect these environment variables.
Please verify:
- Does
docker-compose-vertica.ymlreferenceVERTICA_PORT,VERTICA_HOST,VERTICA_USER,VERTICA_PASS,VERTICA_DATABASE, andVERTICA_SCHEMAas environment variables?- Are the hardcoded values (port 5433, user
dbadmin, passwordvertica, databaseelementary_tests) the correct defaults for your Vertica test setup?- If
profiles.ymlis template-based and uses<SCHEMA_NAME>placeholders, does it also expect the other VERTICA_* variables, or are they only for the docker-compose startup?
184-187: Vertica startup step follows the established pattern.The docker-compose startup is appropriately placed after environment setup and follows the pattern used by other services (Trino, Dremio). The detached mode and working directory are consistent with existing steps. Note the if-condition style inconsistency mentioned in the earlier review comment.
I'm seeing `Database Error: [Errno 32] Broken pipe` in the `Check DWH connection` step.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
.github/workflows/test-warehouse.yml(6 hunks)integration_tests/docker-compose-vertica.yml(1 hunks)
🔇 Additional comments (4)
integration_tests/docker-compose-vertica.yml (2)
20-21: Verify deployment mode configuration.The
deploy: mode: globalsetting is typically used in Docker Swarm mode to run a service on all nodes. For local development or CI environments using docker-compose, this setting is usually unnecessary and should either be removed or explicitly documented why it's needed.For a single-instance Vertica container in CI, remove the deploy section:
- deploy: - mode: globalOr document why this is necessary if it serves a specific purpose in your deployment model.
4-5: Remove hardcoded credentials from docker-compose file.Credentials should not be committed to source control. Since this file is in the repository, the hardcoded
VERTICA_USER,VERTICA_PASS, and app credentials are exposed and represent a security risk.Additionally, the healthcheck command on Line 31 embeds credentials directly in the command string, which may be logged or exposed in logs/CI output.
Recommended approach: Use environment variable substitution to allow credentials to be injected via CI secrets or
.envfiles:environment: - VERTICA_USER: dbadmin - VERTICA_PASS: vertica + VERTICA_USER: ${VERTICA_USER:-dbadmin} + VERTICA_PASS: ${VERTICA_PASS:-vertica} VERTICA_HOST: localhost VERTICA_PORT: 5433 VERTICA_DATABASE: elementary_tests VERTICA_SCHEMA: ${SCHEMA_NAME} - APP_DB_USER: ${VERTICA_USER} - APP_DB_PASSWORD: ${VERTICA_PASS} + APP_DB_USER: ${APP_DB_USER:-${VERTICA_USER:-dbadmin}} + APP_DB_PASSWORD: ${APP_DB_PASSWORD:-${VERTICA_PASS:-vertica}}And update the healthcheck to avoid embedding credentials:
healthcheck: - test: ["CMD-SHELL", "/opt/vertica/bin/vsql -U dbadmin -w vertica -c 'SELECT 1;'"] + test: ["CMD-SHELL", "/opt/vertica/bin/vsql -U $${VERTICA_USER} -w $${VERTICA_PASS} -c 'SELECT 1;'"]Note: In docker-compose, use
$$to escape environment variables in the shell command.Also applies to: 10-11
⛔ Skipped due to learnings
Learnt from: GuyEshdat Repo: elementary-data/dbt-data-reliability PR: 838 File: integration_tests/docker-compose-dremio.yml:36-44 Timestamp: 2025-08-10T11:29:23.299Z Learning: When Docker Compose configurations are taken from official repositories (like dbt-dremio) and are working correctly in tests, prefer to keep them unchanged to maintain consistency with the official implementation, even if there are potential portability concerns with shell features like `/dev/tcp`.Learnt from: GuyEshdat Repo: elementary-data/dbt-data-reliability PR: 838 File: integration_tests/docker/dremio/dremio-setup.sh:3-11 Timestamp: 2025-08-10T11:29:20.155Z Learning: When Docker setup scripts or configuration files are taken from official repositories (like dbt-dremio), they should be kept unchanged to maintain compatibility with the upstream source and ensure easier updates in the future, even if there are potential improvements that could be made..github/workflows/test-warehouse.yml (2)
163-165: Verify profiles.yml template consistency across all warehouse types.The SCHEMA_NAME substitution now uses a sed command to replace
<SCHEMA_NAME>placeholder in the profiles.yml content. This assumes all warehouse profiles (including Vertica profiles stored insecrets.CI_PROFILES_YML) contain this placeholder.Please verify that the Vertica profile in your CI secrets (
secrets.CI_PROFILES_YML) includes<SCHEMA_NAME>as a placeholder that can be substituted, consistent with the approach used for other warehouses. You can check this by examining your stored secrets configuration if accessible, or by confirming the placeholder is used consistently.
184-194: Verify Vertica startup and readiness check logic.The startup steps look correct:
- Line 187 references the correct docker-compose file (
docker-compose-vertica.yml)- Line 193 uses docker inspect to check container health status, which aligns with the healthcheck defined in docker-compose (lines 30-34)
- The timeout of 60 seconds matches the max healthcheck time (5s × 10 retries = 50s)
Please verify that:
- The container name in the docker inspect command (
vertica) matches thecontainer_nameindocker-compose-vertica.yml(it should—currently both arevertica)- The docker healthcheck status can reliably report as "healthy" after Vertica is fully initialized and accepting connections
- The 60-second timeout is sufficient for CI environments to start Vertica (you may need to adjust based on observed startup times in CI runs)
I use this for local dev via `DBT_PROFILES_DIR="path/to/.github/fixtures/" and for GitHub Actions secret `CI_PROFILES_YML`. Linux+Wayland: `base64 .github/fixtures/profiles.yml | wl-copy` MacOS: `base64 .github/fixtures/profiles.yml | pbcopy`
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.github/fixtures/profiles.yml (1)
1-29: Consider using YAML anchors to eliminate profile duplication.The
elementaryandelementary_testsprofiles are structurally identical. Consider using YAML anchors and aliases to reduce duplication and improve maintainability:.vertica_base: &vertica_base type: vertica host: "{{ env_var('VERTICA_HOST') }}" port: "{{ env_var('VERTICA_PORT') | as_number }}" username: "{{ env_var('VERTICA_USER') }}" password: "{{ env_var('VERTICA_PASS') }}" database: "{{ env_var('VERTICA_DATABASE') }}" schema: "{{ env_var('SCHEMA_NAME') }}" connection_load_balance: false retries: 2 threads: 4 elementary: target: vertica outputs: vertica: *vertica_base elementary_tests: target: vertica outputs: vertica: *vertica_base
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
.github/fixtures/profiles.yml(1 hunks).gitignore(1 hunks)integration_tests/docker-compose-vertica.yml(1 hunks)
🔇 Additional comments (5)
.gitignore (1)
7-7: Good practice to ignore fixture user files.Gitignoring
.github/fixtures/.user.ymlis appropriate, as this file likely contains ephemeral or credentials-related data that should not be committed. This aligns with the security concern noted in the PR comments regarding fixture credential management.integration_tests/docker-compose-vertica.yml (3)
20-21: Clarify the purpose ofdeploy: mode: global.The
deploy: mode: globalsetting is typically used for swarm-mode multi-node deployments. In a local testing context, this may be unnecessary or could cause unexpected behavior. Confirm whether this is intentional or should be removed for single-service test setup.
1-36: Ensure environment variable naming aligns across docker-compose and profiles.The docker-compose defines
VERTICA_SCHEMA(line 9) using${SCHEMA_NAME}, but.github/fixtures/profiles.ymlreferencesSCHEMA_NAMEdirectly in the schema field (lines 11, 26). Verify that the environment variable naming is consistent across both files and that CI/CD workflows properly setSCHEMA_NAMEfor both to reference.
16-16: Replace the third-party Vertica image with the official Vertica CE image.The image
ghcr.io/ratiopbc/vertica-ceis not an official Vertica (Community Edition) image; the official container images arevertica/vertica-ceand related images from Vertica/OpenText repositories. Using a third-party, unmaintained image for CI/CD testing introduces maintenance and security risks. Update line 16 to use the official image:image: vertica/vertica-ceVerify the image tag and pull authentication in the official Vertica documentation to ensure compatibility with your test environment.
⛔ Skipped due to learnings
Learnt from: GuyEshdat Repo: elementary-data/dbt-data-reliability PR: 838 File: integration_tests/docker-compose-dremio.yml:36-44 Timestamp: 2025-08-10T11:29:23.299Z Learning: When Docker Compose configurations are taken from official repositories (like dbt-dremio) and are working correctly in tests, prefer to keep them unchanged to maintain consistency with the official implementation, even if there are potential portability concerns with shell features like `/dev/tcp`.Learnt from: GuyEshdat Repo: elementary-data/dbt-data-reliability PR: 838 File: integration_tests/docker/dremio/dremio-setup.sh:3-11 Timestamp: 2025-08-10T11:29:20.155Z Learning: When Docker setup scripts or configuration files are taken from official repositories (like dbt-dremio), they should be kept unchanged to maintain compatibility with the upstream source and ensure easier updates in the future, even if there are potential improvements that could be made..github/fixtures/profiles.yml (1)
7-7: No issues found—code follows dbt-vertica best practices.The
| as_numberfilter is a built-in dbt/Jinja filter that dbt-vertica explicitly recommends for converting environment variables to integers for the port parameter. The code at lines 7 and 22 is correct and matches official documentation.
|
This is the most successful CI run I've gotten so far, and I'd say at this point I'm out of my depth when it comes to dbt and elementary. If there are simple ways to test these failures, I can start looking into them, but I'd need some guidance. There's probably a few things that will fix most of those failures. Vertica is finicky about newlines and empty spaces at the beginning of lines, see 8b84461 for example. |
|
Hey @jc00ke ! Was great seeing you in Coalesce an appreciate the contribution 🙌 |
|
This pull request is stale because it has been open for too long with no activity. |
|
@NoyaOffer I'm hoping to pick this back up soon. Any feedback I can incorporate into this PR? |
|
Hi @jc00ke sorry for the long delay Meanwhile I've pushed your changes to a separate PR, and adapted a bit as we made some changes to the test infra: Looks like the Vertica run there timed out (and failed but I can't see any logs yet) Overall usually the most effective approach is to get to a working local setup with the docker, and iteratively get all the integration tests to pass (or, if you think some features can't be supported in Vertica, add skip markers for it). Now with Claude code / cursor / your favorite AI tool it can actually be quite effective to automate the feedback loop (claude makes changes -> runs tests -> fixes -> runs tests -> etc). |
I'm only seeing this option for user-owned repos; our fork is an org-owned repo. I'm willing to put in the work to reconcile the issues to the best of my ability, but that might require me to pull your PR changes into mine? Or you give me access to your PR? There might be some changes we want/need to back out too, like 731ca2f. Until we get the actual Vertica adapter up to 1.10+, we need to install 1.8.5. Total bummer, I know! |
|
@jc00ke regarding the vertica version - I don't think this is actually needed - the CI will install the latest available version. You can see that it installed 1.8.5 in the CI run there. And actually - I've let Devin (autonomous AI agent we're using) to take a crack at getting it to work. It's been very successful with other adapters - and it helps when it is docker based. (It won't be able to operate on a fork PR so it's actually good to keep the separate PR for now) |
|
@jc00ke Devin was very successful :) WDYT? I can probably merge that PR directly. Just one thing - I noticed you used a docker image that you made yourself (I think?) - would it be possible to supply us with the dockerfile so we'll maintain it? we can include it in the PR. |
|
Nice! I'll try to look at it tonight or tomorrow. My recollection is that I pushed the image that I had pulled from opentext after they took theirs down. Now that I'm thinking of it, I don't think I ever had a Dockerfile, but let me double check tomorrow at the office. |
|
Awesome thanks! Worst case we can always do fixes in a followup release. |
|
I merged the other PR, just so we can move forward with the release (your commits appear in that PR, so you still appear as a contributor) Thanks a lot for the contribution! |
I checked and I never had the original I'm trying to test the changes locally, will report back. Thanks again for all your help! |
|
I'm trying to test and I'm getting 145 failed tests... but I'm likely not familiar enough with Python and dbt to have things set up correctly. The simplest setup I could make is to set elementary_tests:
target: postgres
outputs:
vertica: &vertica
type: vertica
host: localhost
port: 5433
username: dbadmin
password: vertica
database: elementary_tests
schema: xxx
connection_load_balance: false
retries: 2
threads: 4
elementary:
target: postgres
outputs:
vertica:
!!merge <<: *vertica
schema: xxx_elementaryI based this on what CI would generate. Maybe it's wrong? Any extra guidance would be appreciated. |
|
@jc00ke looks indeed like what we have in the CI. Is the docker running successfully with vertica available? |
|
Sorry, I wasn't able to get to this this week and I'm OOO next week. I'll get the info the week after because something is definitely not right (and it's most likely my fault! 😉) Vertica was available; I remember that. |
Hi there!
I spoke with the team at Coalesce about adding Vertica support. This is my initial attempt, and TBH, it's my initial foray into dbt packages, so I appreciate any and all help provided! 🙏
Summary by CodeRabbit
New Features
Tests
Chores
Style